{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "SqueezeNet_implementation.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "authorship_tag": "ABX9TyOAerzt970EI+cxKxDRH8FV",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Machine-Learning-Tokyo/CNN-Architectures/blob/master/Implementations/SqueezeNet/SqueezeNet_implementation.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CT4VrPvP_gZq",
        "colab_type": "text"
      },
      "source": [
        "# Implementation of SqueezeNet\n",
        "\n",
        "We will use the [tensorflow.keras Functional API](https://www.tensorflow.org/guide/keras/functional) to build SqueezeNet from the original paper: “[SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size](https://arxiv.org/abs/1602.07360)” by Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer.\n",
        "\n",
        "[Video tutorial](https://www.youtube.com/watch?v=W4UbinapGMY&list=PLaPdEEY26UXyE3UchW0C742xh542yh0yI&index=7)\n",
        "\n",
        "---\n",
        "\n",
        "In the paper we can read:\n",
        "\n",
        ">**[i]** “[...] we implement our expand layer with two separate convolution layers: a layer with 1x1 filters, and a layer with 3x3 filters. Then, we concatenate the outputs of these layers together in the channel dimension.”\n",
        "\n",
        "<br>\n",
        "\n",
        "We will also make use of the following Table **[ii]**:\n",
        "\n",
        "<img src=\"https://raw.githubusercontent.com/Machine-Learning-Tokyo/DL-workshop-series/master/Part%20I%20-%20Convolution%20Operations/images/SqueezeNet/SqueezeNet.png\" width=\"600\">\n",
        "\n",
        "as well as the following Diagrams **[iii]** and **[iv]**:\n",
        "\n",
        "<img src=\"https://raw.githubusercontent.com/Machine-Learning-Tokyo/DL-workshop-series/master/Part%20I%20-%20Convolution%20Operations/images/SqueezeNet/SqueezeNet_diagram.png\" width=\"350\">\n",
        "\n",
        "<img src=\"https://raw.githubusercontent.com/Machine-Learning-Tokyo/DL-workshop-series/master/Part%20I%20-%20Convolution%20Operations/images/SqueezeNet/SqueezeNet_diagram_2.png\" width=\"70\">\n",
        "\n",
        "---\n",
        "\n",
        "## Network architecture\n",
        "\n",
        "Based on **[ii]**, the network:\n",
        "- starts with a *Convolution*-*MaxPool* block\n",
        "- continues with a series of **Fire blocks** separated by *MaxPool* layers\n",
        "- finishes with *Convolution* and *Average Pool* layers\n",
        "\n",
        "Notice that there is no *Fully Connected* layer in the model, which means that the architecture can, in principle, process images of different sizes.\n",
        "\n",
        "### Fire block\n",
        "\n",
        "The *Fire block* is depicted in **[iii]** and consists of:\n",
        ">1. a 1x1 *Convolution* layer that outputs the `squeezed` tensor\n",
        ">2. a 1x1 *Convolution* layer and a 3x3 *Convolution* layer, both applied to the `squeezed` tensor, whose outputs are then concatenated as described in **[i]**\n",
        "\n",
        "---\n",
        "\n",
        "## Workflow\n",
        "We will:\n",
        "1. import the necessary layers\n",
        "2. write a helper function for the Fire block (**[iii]**)\n",
        "3. write the stem of the model\n",
        "4. use the helper function to write the main part of the model\n",
        "5. write the last part of the model"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "DfIjiCVv_ir2",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras.layers import Input, Conv2D, Concatenate, \\\n",
        "     MaxPool2D, GlobalAvgPool2D, Activation"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wjIEqvpN_j6b",
        "colab_type": "text"
      },
      "source": [
        "### 2. Fire block\n",
        "Next, we will write the Fire block function.\n",
        "\n",
        "This function will:\n",
        "- take as inputs:\n",
        "  - a tensor (**`x`**)\n",
        "  - the number of filters of the 1st 1x1 Convolution layer (**`squeeze_filters`**)\n",
        "  - the number of filters of each of the 2nd 1x1 Convolution and the 3x3 Convolution layers (**`expand_filters`**)\n",
        "- run the following steps:\n",
        "  - apply a 1x1 conv operation on **`x`** to get the **`squeezed`** tensor\n",
        "  - apply a 1x1 conv and a 3x3 conv operation on **`squeezed`**\n",
        "  - *Concatenate* these two tensors and apply a *ReLU* activation\n",
        "- return the concatenated tensor"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oroh1iwN_oUj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def fire_block(x, squeeze_filters, expand_filters):\n",
        "    # Squeeze: 1x1 convolution producing `squeeze_filters` channels\n",
        "    squeezed = Conv2D(filters=squeeze_filters,\n",
        "                      kernel_size=1,\n",
        "                      activation='relu')(x)\n",
        "    # Expand: parallel 1x1 and 3x3 convolutions on the squeezed tensor\n",
        "    expanded_1x1 = Conv2D(filters=expand_filters,\n",
        "                        kernel_size=1)(squeezed)\n",
        "    expanded_3x3 = Conv2D(filters=expand_filters,\n",
        "                        kernel_size=3,\n",
        "                        padding='same')(squeezed)\n",
        "    # Concatenate along the channel axis, then apply a single ReLU\n",
        "    x = Concatenate()([expanded_1x1, expanded_3x3])\n",
        "    x = Activation('relu')(x)\n",
        "    return x"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TilniSY__qN6",
        "colab_type": "text"
      },
      "source": [
        "### 3. Model stem\n",
        "Based on **[ii]**:\n",
        "\n",
        "| layer name/type \t| output size \t| filter size / stride \t|\n",
        "|-----------------\t|:-----------:\t|--------------------:\t|\n",
        "| input image     \t| 224x224x3   \t|                      \t|\n",
        "| conv1           \t| 111x111x96  \t| 7x7/2 (x96)          \t|\n",
        "| maxpool1        \t| 55x55x96    \t| 3x3/2                \t|\n",
        "\n",
        "the model starts with:\n",
        ">1. a Convolution layer with 96 filters, kernel size 7x7 and stride 2 applied on a 224x224x3 input image\n",
        ">2. a MaxPool layer with pool size 3x3 and stride 2\n",
        "\n",
        "Note that the code below uses `padding='same'`, so the feature maps are 112x112x96 and 56x56x96 rather than the 111x111x96 and 55x55x96 listed in the table. The number of parameters is not affected by the padding."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2h3dzyOZ_u2P",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "input = Input([224, 224, 3])\n",
        " \n",
        "x = Conv2D(96, 7, strides=2, padding='same', activation='relu')(input)\n",
        "x = MaxPool2D(3, strides=2, padding='same')(x)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D5ksGP64_3aP",
        "colab_type": "text"
      },
      "source": [
        "### 4. Main part\n",
        "Based on **[ii]**:\n",
        "\n",
        "| layer name/type \t| filter size / stride \t| s1x1(#1x1 squeeze) \t| e1x1(#1x1 expand) \t| e3x3(#3x3 expand) \t|\n",
        "|-----------------\t|----------------------\t|--------------------\t|-------------------\t|-------------------\t|\n",
        "| fire2           \t|                      \t| 16                 \t| 64                \t| 64                \t|\n",
        "| fire3           \t|                      \t| 16                 \t| 64                \t| 64                \t|\n",
        "| fire4           \t|                      \t| 32                 \t| 128               \t| 128               \t|\n",
        "| maxpool4        \t| 3x3/2                \t|                    \t|                   \t|                   \t|\n",
        "| fire5           \t|                      \t| 32                 \t| 128               \t| 128               \t|\n",
        "| fire6           \t|                      \t| 48                 \t| 192               \t| 192               \t|\n",
        "| fire7           \t|                      \t| 48                 \t| 192               \t| 192               \t|\n",
        "| fire8           \t|                      \t| 64                 \t| 256               \t| 256               \t|\n",
        "| maxpool8        \t| 3x3/2                \t|                    \t|                   \t|                   \t|\n",
        "| fire9           \t|                      \t| 64                 \t| 256               \t| 256               \t|\n",
        "\n",
        "the model continues with:\n",
        ">1. Fire block (fire2) with 16 squeeze and 64 expand filters\n",
        ">2. Fire block (fire3) with 16 squeeze and 64 expand filters\n",
        ">3. Fire block (fire4) with 32 squeeze and 128 expand filters\n",
        ">4. a MaxPool layer (maxpool4) with pool size 3x3 and stride 2\n",
        ">5. Fire block (fire5) with 32 squeeze and 128 expand filters\n",
        ">6. Fire block (fire6) with 48 squeeze and 192 expand filters\n",
        ">7. Fire block (fire7) with 48 squeeze and 192 expand filters\n",
        ">8. Fire block (fire8) with 64 squeeze and 256 expand filters\n",
        ">9. a MaxPool layer (maxpool8) with pool size 3x3 and stride 2\n",
        ">10. Fire block (fire9) with 64 squeeze and 256 expand filters"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "z9tKdXpl_6l5",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x = fire_block(x, squeeze_filters=16, expand_filters=64)\n",
        "x = fire_block(x, squeeze_filters=16, expand_filters=64)\n",
        "x = fire_block(x, squeeze_filters=32, expand_filters=128)\n",
        "x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)\n",
        " \n",
        "x = fire_block(x, squeeze_filters=32, expand_filters=128)\n",
        "x = fire_block(x, squeeze_filters=48, expand_filters=192)\n",
        "x = fire_block(x, squeeze_filters=48, expand_filters=192)\n",
        "x = fire_block(x, squeeze_filters=64, expand_filters=256)\n",
        "x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)\n",
        " \n",
        "x = fire_block(x, squeeze_filters=64, expand_filters=256)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ur9xKyDe_-34",
        "colab_type": "text"
      },
      "source": [
        "### 5. Last part\n",
        "Based on **[ii]**:\n",
        "\n",
        "| layer name/type \t| filter size / stride \t|\n",
        "|-----------------\t|----------------------\t|\n",
        "| conv10          \t| 1x1/1 (x1000)        \t|\n",
        "| avgpool10       \t| 13x13/1              \t|\n",
        "\n",
        "the model ends with:\n",
        ">1. a Convolution layer with 1000 filters and kernel size 1x1\n",
        ">2. an Average Pool layer with stride 1 which, based on **[iv]**, is *Global*\n",
        ">3. a *Softmax* activation applied to the resulting output vector (**[iv]**)"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zvQj2rOx__zl",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x = Conv2D(filters=1000, kernel_size=1)(x)\n",
        "x = GlobalAvgPool2D()(x)\n",
        " \n",
        "output = Activation('softmax')(x)\n",
        " \n",
        "from tensorflow.keras import Model\n",
        "model = Model(input, output)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "-bi8brXyDMbk",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras.utils import plot_model\n",
        "plot_model(model, show_shapes=True)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zKVQrvi_ALWb",
        "colab_type": "text"
      },
      "source": [
        "### Check number of parameters\n",
        "\n",
        "We can also check the total number of parameters of the model by calling `count_params()` on each element of `model.trainable_weights` and summing the results.\n",
        "\n",
        "According to **[ii]** (col: #parameter before pruning), the SqueezeNet model has 1,248,424 parameters in total."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zKObsHRmALvs",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "import tensorflow.keras.backend as K\n",
        "int(np.sum([K.count_params(p) for p in model.trainable_weights]))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aowZJV1AADhi",
        "colab_type": "text"
      },
      "source": [
        "## Final code"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "m4qWgjVPAERo",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras import Model\n",
        "from tensorflow.keras.layers import Input, Conv2D, Concatenate, \\\n",
        "     MaxPool2D, GlobalAvgPool2D, Activation\n",
        "\n",
        "\n",
        "def fire_block(x, squeeze_filters, expand_filters):\n",
        "    # Squeeze: 1x1 convolution producing `squeeze_filters` channels\n",
        "    squeezed = Conv2D(filters=squeeze_filters,\n",
        "                      kernel_size=1,\n",
        "                      activation='relu')(x)\n",
        "    # Expand: parallel 1x1 and 3x3 convolutions on the squeezed tensor\n",
        "    expanded_1x1 = Conv2D(filters=expand_filters,\n",
        "                        kernel_size=1,\n",
        "                        activation='relu')(squeezed)\n",
        "    expanded_3x3 = Conv2D(filters=expand_filters,\n",
        "                        kernel_size=3,\n",
        "                        padding='same',\n",
        "                        activation='relu')(squeezed)\n",
        "\n",
        "    # Concatenate the two expand outputs along the channel axis\n",
        "    output = Concatenate()([expanded_1x1, expanded_3x3])\n",
        "    return output\n",
        "\n",
        "\n",
        "# Stem: conv1 + maxpool1\n",
        "inputs = Input([224, 224, 3])\n",
        "\n",
        "x = Conv2D(96, 7, strides=2, padding='same', activation='relu')(inputs)\n",
        "x = MaxPool2D(3, strides=2, padding='same')(x)\n",
        "\n",
        "# Main part: fire2 to fire9 with maxpool4 and maxpool8\n",
        "x = fire_block(x, squeeze_filters=16, expand_filters=64)\n",
        "x = fire_block(x, squeeze_filters=16, expand_filters=64)\n",
        "x = fire_block(x, squeeze_filters=32, expand_filters=128)\n",
        "x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)\n",
        "\n",
        "x = fire_block(x, squeeze_filters=32, expand_filters=128)\n",
        "x = fire_block(x, squeeze_filters=48, expand_filters=192)\n",
        "x = fire_block(x, squeeze_filters=48, expand_filters=192)\n",
        "x = fire_block(x, squeeze_filters=64, expand_filters=256)\n",
        "x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)\n",
        "\n",
        "x = fire_block(x, squeeze_filters=64, expand_filters=256)\n",
        "\n",
        "# Last part: conv10, global average pool, softmax\n",
        "x = Conv2D(filters=1000, kernel_size=1)(x)\n",
        "x = GlobalAvgPool2D()(x)\n",
        "\n",
        "outputs = Activation('softmax')(x)\n",
        "\n",
        "model = Model(inputs, outputs)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ww1uYRmh_KAK",
        "colab_type": "text"
      },
      "source": [
        "## Model diagram\n",
        "\n",
        "<img src=\"https://raw.githubusercontent.com/Machine-Learning-Tokyo/CNN-Architectures/master/Implementations/SqueezeNet/SqueezeNet_diagram.svg?sanitize=true\">"
      ]
    }
  ]
}